Parallel Sorting on a Shared-Nothing Architecture using Probabilistic Splitting

Authors

  • David J. DeWitt
  • Jeffrey F. Naughton
  • Donovan A. Schneider
Abstract

We consider the problem of external sorting in a shared-nothing multiprocessor. A critical step in the algorithms we consider is to determine the range of sort keys to be handled by each processor. We consider two techniques for determining these ranges of sort keys: exact splitting, using a parallel version of the algorithm proposed by Iyer, Ricard, and Varman; and probabilistic splitting, which uses sampling to estimate quantiles. We present analytic results showing that probabilistic splitting performs better than exact splitting. Finally, we present experimental results from an implementation of sorting via probabilistic splitting in the Gamma parallel database machine.
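The idea of probabilistic splitting described in the abstract can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the Gamma implementation: each inner list stands in for one processor's local data, and the function names (`choose_splitters`, `redistribute`, `parallel_sort`) and the `samples_per_proc` parameter are hypothetical. A random sample is drawn from every partition, evenly spaced elements of the sorted sample serve as estimated quantiles, and each key is then routed to the processor that owns its range.

```python
import bisect
import random


def choose_splitters(partitions, num_procs, samples_per_proc, rng):
    """Estimate num_procs - 1 range boundaries from a random sample of keys."""
    sample = []
    for part in partitions:
        k = min(samples_per_proc, len(part))
        sample.extend(rng.sample(part, k))  # sample without replacement
    sample.sort()
    if not sample:
        return []  # no data at all: a single empty range
    # Evenly spaced elements of the sorted sample approximate the quantiles.
    return [sample[(i * len(sample)) // num_procs] for i in range(1, num_procs)]


def redistribute(partitions, splitters):
    """Route every key to the bucket (processor) whose key range contains it."""
    buckets = [[] for _ in range(len(splitters) + 1)]
    for part in partitions:
        for key in part:
            buckets[bisect.bisect_right(splitters, key)].append(key)
    return buckets


def parallel_sort(partitions, samples_per_proc=100, seed=0):
    """Simulate the sort: sample, split into ranges, then sort each range locally."""
    rng = random.Random(seed)
    splitters = choose_splitters(partitions, len(partitions), samples_per_proc, rng)
    buckets = redistribute(partitions, splitters)
    # On a real shared-nothing machine these local sorts would run concurrently.
    return [sorted(bucket) for bucket in buckets]
```

Concatenating the returned runs in order yields the fully sorted data. How evenly the keys are spread across processors depends on how well the sample estimates the true quantiles, which is the trade-off against exact splitting that the abstract refers to.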

Related resources

Parallel Sorting on a Shared-Nothing Architecture using Probabilistic Splitting

Sorting large datasets is often limited by I/O bandwidth in terms of memory and disk. The traditional von Neumann architecture results in high cache misses within L1-L3 cache levels. The authors realized that the highly parallelized processors and the fast memory interconnects inside commodity GPUs can help to work around the limitations that arise when sorting is done solely on the CPU. Specif...

A parallel sort-balance mutual range-join algorithm on hypercube computers

This paper presents an efficient parallel algorithm for computing the mutual range-join of N sets of numbers on shared-nothing hypercube computers. The algorithm iteratively joins each set to the mutual range-join of the preceding sets. Each join is performed on all processors of the hypercube in parallel. The algorithm uses a global sorting method to distribute the elements of the first set evenly...

Performance Evaluation of a Two-Level Hierarchical Parallel Database System

Two typical architectures of parallel database systems are the shared-everything and shared-nothing architectures. Shared-everything architecture provides better performance than the shared-nothing architecture but it is not scalable to large system sizes. On the other hand, shared-nothing architecture provides good system scalability but is sensitive to data skew. Hierarchical architectures ha...

Stack splitting: A technique for efficient exploitation of search parallelism on share-nothing platforms

We study the problem of exploiting parallelism from search-based AI systems on share-nothing platforms, i.e., platforms where different machines do not have access to any form of shared memory. We propose a novel environment representation technique, called stack-splitting, which is a modification of the well-known stack-copying technique, that enables the efficient exploitation of or-paralleli...

A Scalable Parallel Sorting Algorithm Using Exact Splitting

Sorting is one of the most fundamental algorithmic kernels, used by a large fraction of computer applications. This paper proposes a novel parallel sorting algorithm based on exact splitting that combines excellent scaling behavior with universal applicability. In contrast to many existing parallel sorting algorithms that make limiting assumptions regarding the input problem or the underlying c...


Publication date: 1991